[Figure: x86 disassembly listing with columns for address, opcode bytes, mnemonic, and operands]

• Two phases:

1. Embed a unique identifier ("mark") into the object

2. Identify the object by extracting the fingerprint mark



• Fingerprint mark identifies party that uses the object 

• In contrast to watermarking, which is used to claim ownership

• Software use case: given a copy of the software, find 
out who it has been sold to 



Implementing a Virtual Machine-based Fingerprinting Scheme | HORST GÖRTZ INSTITUT FÜR IT-SICHERHEIT | SPRING 9



RUHR-UNIVERSITÄT BOCHUM

Fingerprinting II 






• Three types of fingerprints, determined by extraction 
phase: 

1. Static 

2. Dynamic 

3. Abstract 



• Balance properties: 

1. Stealth 

2. Data Rate 

3. Resilience 
• Structure commonly used in software protection systems



• Basic idea: Translate (parts of) native code into a custom architecture and embed an interpreter (VM)

• Breaks existing tools

• Non-trivial to attack generically

• Hides the original semantics and provides tamper-proofing



• A set of handlers describes the semantics
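
The handler-based structure can be illustrated with a minimal sketch. It is not the VM from the talk: the opcode values, the register names, the `vm_halt` handler, and the `run` loop are assumptions chosen for illustration. It only shows how bytecode, a handler table, and a VM context with a virtual instruction pointer (vIP) fit together.

```python
from dataclasses import dataclass, field


@dataclass
class VMContext:
    """Hypothetical VM context: virtual instruction pointer plus registers."""
    vip: int = 0
    regs: dict = field(default_factory=lambda: {"eax": 0, "ecx": 0})
    halted: bool = False


def vm_mov_reg_imm(ctx, reg, imm):
    ctx.regs[reg] = imm & 0xFFFFFFFF


def vm_add_reg_reg(ctx, dst, src):
    ctx.regs[dst] = (ctx.regs[dst] + ctx.regs[src]) & 0xFFFFFFFF


def vm_xor_reg_reg(ctx, dst, src):
    ctx.regs[dst] ^= ctx.regs[src]


def vm_halt(ctx):
    # Not part of the slides; added so the sketch terminates.
    ctx.halted = True


# Handler table: opcode -> handler code. The opcode values are arbitrary.
HANDLER_TABLE = {
    0x32: vm_mov_reg_imm,
    0x5A: vm_add_reg_reg,
    0x07: vm_xor_reg_reg,
    0xFF: vm_halt,
}


def run(bytecode):
    """Fetch operands, execute the handler, update the context, dispatch next."""
    ctx = VMContext()
    while not ctx.halted:
        opcode, *operands = bytecode[ctx.vip]
        ctx.vip += 1
        HANDLER_TABLE[opcode](ctx, *operands)
    return ctx


program = [
    (0x32, "eax", 0xDEAD0000),
    (0x32, "ecx", 0x0000BEEF),
    (0x5A, "eax", "ecx"),
    (0xFF,),
]
result = run(program)
print(hex(result.regs["eax"]))
```

The semantics of the program live entirely in the handlers and in the mapping from opcodes to handlers, which is why an analyst has to understand the interpreter before the bytecode means anything.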
[Figure: Structure of a VM]

Bytecode: a sequence of entries, each consisting of an opcode and its parameters (for example, the parameters 0x0f00 and 0xdeadbeef).

Handler table: maps each opcode to handler code, e.g. vm_mov_reg_imm, vm_add_reg_reg, vm_xor_reg_reg, vm_mov_reg_reg.

VM context:

entry          value
vIP            [pointer]
handler tbl    [pointer]
native eax     0xdeadbeef
native ecx     0x1badc0de

vm_mov_reg_imm handler: fetch operands -> calculate -> update ctx -> dispatch next
Implemented Schemes

• Based on a patent by Davidson and Myhrvold (1996)

• Embeds the mark in the order of the basic blocks of a function

• Mark is extracted by comparing the order in the binary to a canonical ordering

• But: prone to subsequent application, since reordering the blocks again replaces the mark!



• Approach here: Embed the mark in the permutation of the handler table

• Subsequent application results in a non-functional program!
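
One way to map a numeric mark to a permutation and back is the factorial number system. The following is a minimal sketch under that assumption; the slides do not state which encoding is used, and the function names and the handler list are hypothetical.

```python
from math import factorial


def mark_to_permutation(mark, canonical):
    """Return the mark-th permutation (lexicographic order) of canonical."""
    pool = list(canonical)
    if not 0 <= mark < factorial(len(pool)):
        raise ValueError("mark does not fit into this handler table")
    permuted = []
    for remaining in range(len(pool), 0, -1):
        index, mark = divmod(mark, factorial(remaining - 1))
        permuted.append(pool.pop(index))
    return permuted


def permutation_to_mark(permuted, canonical):
    """Recover the mark by comparing the extracted table to the canonical one."""
    pool = list(canonical)
    mark = 0
    for remaining, handler in zip(range(len(pool), 0, -1), permuted):
        index = pool.index(handler)
        mark += index * factorial(remaining - 1)
        pool.pop(index)
    return mark


# Hypothetical canonical handler table (handlers in a fixed, agreed-upon order).
canonical_table = [
    "vm_mov_reg_imm", "vm_mov_reg_reg", "vm_add_reg_reg", "vm_xor_reg_reg",
    "vm_and_reg_reg", "vm_add_reg_imm", "vm_jmp",
]
fingerprinted_table = mark_to_permutation(1337, canonical_table)
recovered_mark = permutation_to_mark(fingerprinted_table, canonical_table)
print(fingerprinted_table, recovered_mark)
```

A table with n distinct handlers can carry any mark below n!, so a 256-entry table offers far more capacity than a customer ID needs. If the opcodes in the bytecode are indices into the handler table, as is presumably the case here, permuting the table again without re-encoding the bytecode makes every opcode dispatch to the wrong handler, which would explain why subsequent application breaks the program.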
[Figure: Extraction of the mark. The handler table extracted from the fingerprinted binary is compared with its canonical form (handler indices 00 to FF). Looking up the canonical index of each extracted handler yields the permutation (Perm.) that carries the mark.]
• Based on a method by Linn et al., with an extension by Collberg et al.

• Mark is encoded in an (unstealthy!) series of unconditional branches

• Branch direction encodes one bit

• Extraction uses an execution trace

• Approach here: Transferred verbatim, but the extraction phase is problematic due to the VM layer

• Circumvent the VM layer without lowering its security?

• VM Trapdooring: constant (secret) seed when generating components
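
The branch-direction idea can be sketched as follows. This is an illustration, not the implementation from the talk: the convention that a forward jump encodes 1 and a backward jump encodes 0, the slot layout strategy, and all function names are assumptions. The VM layer and the trapdoor are left out; the trace is simply a list of visited addresses.

```python
def embed_bits(bits):
    """Lay out a chain of unconditional jumps whose directions spell out bits.

    Returns the entry slot and a jump table {slot: next slot or "target"}.
    Slots are abstract addresses 0..len(bits).
    """
    low, high = 0, len(bits)
    order = []
    for bit in bits:
        if bit:            # next jump must go forward: start from a low slot
            order.append(low)
            low += 1
        else:              # next jump must go backward: start from a high slot
            order.append(high)
            high -= 1
    order.append(low)      # last slot of the chain, jumps to the real target
    jumps = dict(zip(order, order[1:]))
    jumps[order[-1]] = "target"
    return order[0], jumps


def trace_chain(entry, jumps):
    """Simulate execution and record the visited slots (the execution trace)."""
    trace = [entry]
    while jumps[trace[-1]] != "target":
        trace.append(jumps[trace[-1]])
    return trace


def extract_bits(trace):
    """Forward jump -> 1, backward jump -> 0 (assumed convention)."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(trace, trace[1:])]


mark_bits = [1, 0, 1, 0, 1, 0, 1]
entry, jump_table = embed_bits(mark_bits)
recovered_bits = extract_bits(trace_chain(entry, jump_table))
print(jump_table, recovered_bits)
```

Extraction only needs the order in which the jumps execute, which is why the scheme is classified as dynamic. Inside a VM the jumps are virtual instructions, so the trace has to be taken at the level of handler executions instead of native branches.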
[Figure: Jump chain in the virtualized code, encoding the fingerprint 0b1010101]

address   instruction
00        jmp 35
01        jmp 07
02        jmp 0B
...
07        jmp target
0B        jmp 00
...
12        jmp 24
...
24        jmp 02
...
35        jmp 01

[Figure: Extraction by intercepting handler execution]

jmp target (IA-32) is translated to the VM code:

mov_reg_imm tmp, target
mov_reg_reg vIP, tmp

The handler table entries for vm_mov_reg_imm and vm_mov_reg_reg are intercepted by observers:

vm_mov_reg_imm observer: verify vIP update, track target immediate
vm_mov_reg_reg observer: verify VM sequence, track dst register
• Handler Duplication: duplicate handler code

• Multiple handlers encode same semantics 

• Multiple opcodes per virtual instruction 

• We have a choice when encoding bytecode 



• Approach here: Group equivalent handlers and assign 
values to each member in a group (cf. Monden et al.) 

• Every encoded virtual instruction embeds a few bits 
based on the handler it chooses 

• Embed mark in all emitted instructions 
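
The grouping idea can be sketched like this. The opcode assignment, the bit order, and the function names are assumptions for illustration; the sketch also assumes that every group size is a power of two, so a group of 2^k equivalent handlers carries k bits per emitted instruction.

```python
# Hypothetical groups of equivalent handlers: semantics -> opcodes.
# The position of an opcode inside its group is the value it encodes.
GROUPS = {
    "vm_mov_reg_imm": [0x00, 0x04, 0x05, 0xFF],   # 4 members -> 2 bits
    "vm_add_reg_imm": [0x02, 0x06],               # 2 members -> 1 bit
    "vm_mov_reg_reg": [0x01],                     # 1 member  -> 0 bits
    "vm_and_reg_reg": [0x03],
}


def bits_per_choice(group):
    """Number of mark bits one choice from this group can carry."""
    return len(group).bit_length() - 1


def embed(instructions, bits):
    """Pick one opcode per virtual instruction according to the mark bits."""
    stream = list(bits)
    opcodes = []
    for semantics in instructions:
        group = GROUPS[semantics]
        k = bits_per_choice(group)
        chunk = stream[:k]
        del stream[:k]
        chunk += [0] * (k - len(chunk))            # pad once the mark is used up
        value = int("".join(map(str, chunk)), 2) if k else 0
        opcodes.append(group[value])
    if stream:
        raise ValueError("program too short to carry the whole mark")
    return opcodes


def extract(opcodes):
    """Map every opcode back to the bits implied by its position in its group."""
    lookup = {
        opcode: (value, bits_per_choice(group))
        for group in GROUPS.values()
        for value, opcode in enumerate(group)
    }
    bits = []
    for opcode in opcodes:
        value, k = lookup[opcode]
        bits += [(value >> shift) & 1 for shift in range(k - 1, -1, -1)]
    return bits


virtual_program = ["vm_mov_reg_imm", "vm_add_reg_imm", "vm_mov_reg_reg", "vm_mov_reg_imm"]
emitted_opcodes = embed(virtual_program, [1, 1, 0, 0, 1])
print([hex(op) for op in emitted_opcodes], extract(emitted_opcodes))
```

Because every emitted instruction contributes bits, a short mark can be repeated many times across the bytecode, which is one plausible reading of "Embed mark in all emitted instructions".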
[Figure: Handler duplication]

Handler table:

opcode   semantics
00       vm_mov_reg_imm
01       vm_mov_reg_reg
02       vm_add_reg_imm
03       vm_and_reg_reg
04       vm_mov_reg_imm
05       vm_mov_reg_imm
06       vm_add_reg_imm
...
FF       vm_mov_reg_imm

The members of a group of equivalent handlers are assigned the encoded bits (enc. bits) 00, 01, 10, and 11.

Bytecode: the parameters of each instruction are fixed (e.g. 0x0f00, 0xdeadbeef), while its opcode (??) is chosen among the equivalent handlers according to the mark.
Conclusion

• Schemes draw on the resilience provided by the VM

• Exploit specific VM traits, tied to VM layer 

• Comes at the cost of increased time/space complexity 

• Refrain from protecting performance-critical sections 
Thank you for your attention!



Any questions? 



@dwuid 